Learning Spam: Simple Techniques For Freely-Available Software
Abstract
Automatically filtering out spam e-mail with a classifier based on machine learning methods has attracted great recent interest. This paper introduces machine learning methods for spam filtering and reviews some of the relevant ideas and work in the open source community. It gives an overview of several feature detection and machine learning techniques for spam filtering, discusses the authors’ freely-available implementations of these techniques, and evaluates the techniques’ performance on several different corpora. Finally, it draws some conclusions about the state of the art and about fruitful directions in spam filtering for practitioners working with freely-available UNIX software.
Related resources
The Fight against Spam - A Machine Learning Approach
The paper presents a brief survey of the fight between spammers and antispam software developers, and also describes new approaches to spam filtering. In the first two sections we present a survey of the currently existing spam types. Some well-mapped spammer tricks are also described, although the imagination of spam distributors is endless, and therefore only the most common tricks are covere...
Study on the effectiveness of anomaly detection for spam filtering
Spam has become an important problem for computer security because it is a channel for spreading threats, including computer viruses, worms and phishing. Currently, more than 85% of received emails are spam. Historical approaches to combating these messages, including simple techniques such as sender blacklisting or using email signatures, are no longer completely reliable on their own. Many so...
A dynamic model for integrating simple web spam classification techniques
Over the last years, Internet spam content has spread enormously inside web sites mainly due to the emergence of new web technologies oriented towards the online sharing of resources and information. In such a situation, both academia and industry have shown their concern to accurately detect and effectively control web spam, resulting in a good number of anti-spam techniques currently availabl...
Training SpamAssassin with Active Semi-supervised Learning
Most spam filters include some automatic pattern classifiers based on machine learning and pattern recognition techniques. Such classifiers often require a large training set of labeled emails to attain a good discriminant capability between spam and legitimate emails. In addition, they must be frequently updated because of the changes introduced by spammers to their emails to evade spam filter...
A Survey on Various Classifiers Detecting Gratuitous Email Spamming
Email becomes the major source of communication these days. Most humans on the earth use email for their personal or professional use. Email is an effective, faster and cheaper way of communication. The importance and usage for the email is growing day by day. It provides a way to easily transfer information globally with the help of internet. Due to it the email spamming is increasing day by d...